SOTIF-Compliant Scenario Generation Using Semi-Concrete Scenarios and Parameter Sampling
The SOTIF standard (ISO 21448) requires scenario-based testing to verify and
validate Advanced Driver Assistance Systems and Automated Driving Systems but
does not suggest any practical way to do so effectively and efficiently.
Existing scenario generation approaches either focus on exploring or exploiting
the scenario space. This generally leads to test suites that cover many known
cases but potentially miss edge cases or focused test suites that are effective
but also contain less diverse scenarios. To generate SOTIF-compliant test
suites that achieve higher coverage and find more faults, this paper proposes
semi-concrete scenarios and combines them with parameter sampling to adequately
balance scenario space exploration and exploitation. Semi-concrete scenarios
enable combinatorial scenario generation techniques that systematically explore
the scenario space, while parameter sampling allows for the exploitation of
continuous parameters. Our experimental results show that the proposed concept
can generate more effective test suites than state-of-the-art coverage-based
sampling. Moreover, our results show that including a feedback mechanism to
drive parameter sampling further increases test suites' effectiveness.
Comment: accepted at IEEE ITSC 202
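As a rough illustration of the idea described above, the following Python sketch pairs combinatorial exploration of discrete scenario elements with random sampling of continuous parameters. All scenario elements, parameter names, and ranges are hypothetical and not taken from the paper.

```python
# Hypothetical sketch: semi-concrete scenarios fix discrete choices
# combinatorially and leave continuous parameters as ranges that are
# sampled afterwards. All names and ranges are illustrative.
import itertools
import random

# Discrete scenario elements, explored exhaustively (combinatorial part).
road_types = ["straight", "curve", "intersection"]
weather = ["clear", "rain", "fog"]
actors = ["none", "pedestrian", "lead_vehicle"]

# Continuous parameter ranges, exploited by sampling (here: uniform).
param_ranges = {"ego_speed_kmh": (20.0, 130.0), "friction": (0.3, 1.0)}

def sample_parameters(ranges, rng):
    """Draw one concrete value per continuous parameter."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}

rng = random.Random(42)
test_suite = []
for road, wx, actor in itertools.product(road_types, weather, actors):
    semi_concrete = {"road": road, "weather": wx, "actor": actor}
    # Several concrete instantiations per semi-concrete scenario.
    for _ in range(3):
        test_suite.append({**semi_concrete, **sample_parameters(param_ranges, rng)})

print(f"{len(test_suite)} concrete scenarios generated")
```

A feedback mechanism, as mentioned in the abstract, would replace the uniform sampling with a strategy that biases later draws toward parameter regions where earlier tests failed.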
Machine Learning-based Test Selection for Simulation-based Testing of Self-driving Cars Software
Simulation platforms facilitate the development of emerging Cyber-Physical
Systems (CPS) like self-driving cars (SDC) because they are more efficient and
less dangerous than field operational tests. Despite this, thoroughly
testing SDCs in simulated environments remains challenging because SDCs must be
tested in a vast number of long-running test cases. Past results on software
testing optimization have shown that not all the test cases contribute equally
to establishing confidence in test subjects' quality and reliability, and the
execution of "safe and uninformative" test cases can be skipped to reduce
testing effort. However, this problem is only partially addressed in the
context of SDC simulation platforms. In this paper, we investigate test
selection strategies to increase the cost-effectiveness of simulation-based
testing in the context of SDCs. We propose an approach called SDC-Scissor (SDC
coSt-effeCtIve teSt SelectOR) that leverages Machine Learning (ML) strategies
to identify and skip test cases that are unlikely to detect faults in SDCs
before executing them.
Our evaluation shows that SDC-Scissor outperforms the baselines. With the
Logistic model, we achieve an accuracy of 70%, a precision of 65%, and a recall
of 80% in selecting tests leading to a fault and improved testing
cost-effectiveness. Specifically, SDC-Scissor avoided the execution of 50% of
the unnecessary tests and outperformed two baseline strategies.
Complementary to existing work, we also integrated SDC-Scissor into the context
of an industrial organization in the automotive domain to demonstrate how it
can be used in industrial settings.
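The core selection idea lends itself to a compact illustration: train a classifier on features of previously executed tests and skip new tests predicted to be unlikely to expose a fault. The sketch below uses scikit-learn's logistic regression on synthetic data; the features, labels, and threshold are assumptions, not the paper's exact setup.

```python
# Illustrative sketch of ML-based test selection in the spirit of
# SDC-Scissor: train a classifier on features of already-executed tests,
# then skip new tests predicted to be "safe" (unlikely to expose a fault).
# Features, labels, and the 0.5 threshold are assumptions for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Stand-in features per test case (e.g., road length, curvature statistics).
X = rng.random((500, 3))
# Stand-in labels: 1 = test exposed a fault, 0 = test passed.
y = (X[:, 1] + 0.3 * rng.random(500) > 0.9).astype(int)

X_train, X_new, y_train, _ = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LogisticRegression().fit(X_train, y_train)

# Execute only tests whose predicted fault probability is high enough;
# the rest are skipped before any long-running simulation starts.
fault_prob = clf.predict_proba(X_new)[:, 1]
selected = np.flatnonzero(fault_prob >= 0.5)
print(f"executing {selected.size} of {len(X_new)} candidate tests")
```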
Cost-effective Simulation-based Test Selection in Self-driving Cars Software
Simulation environments are essential for the continuous development of
complex cyber-physical systems such as self-driving cars (SDCs). Previous
results on simulation-based testing for SDCs have shown that many automatically
generated tests do not strongly contribute to the identification of SDC faults
and hence do not contribute to increasing the quality of SDCs. Because running
such "uninformative" tests generally leads to a waste of computational
resources and a drastic increase in the testing cost of SDCs, testers should
avoid them. However, identifying "uninformative" tests before running them
remains an open challenge. Hence, this paper proposes SDC-Scissor, a framework
that leverages Machine Learning (ML) to identify SDC tests that are unlikely to
detect faults in the SDC software under test, thus enabling testers to skip
their execution and drastically increase the cost-effectiveness of
simulation-based testing of SDC software. Our evaluation of six ML models
on two large datasets comprising 22'652 tests showed
that SDC-Scissor achieved a classification F1-score of up to 96%. Moreover, our
results show that SDC-Scissor outperformed a randomized baseline in identifying
more failing tests per time unit.
Webpage & Video: https://github.com/ChristianBirchler/sdc-scissor
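For readers unfamiliar with the two evaluation angles used above, the following minimal sketch computes a classification F1-score and a failing-tests-per-time-unit figure; all numbers are made up for illustration.

```python
# Minimal sketch of the two evaluation angles mentioned above: the
# classification F1-score and the number of failing tests found per time
# unit. All numbers below are made up for illustration.
from sklearn.metrics import f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # 1 = test actually failed
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # classifier's prediction
print("F1-score:", f1_score(y_true, y_pred))

# Failing tests found per hour when only predicted-failing tests are run.
failing_found = 37          # failing tests exposed by the selected subset
wall_clock_hours = 5.5      # simulation time spent on that subset
print("failing tests per hour:", failing_found / wall_clock_hours)
```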
SBST tool competition 2021
We report on the organization, challenges, and results of the ninth edition of the Java Unit Testing Competition as well as the first edition of the Cyber-Physical Systems Testing Tool Competition.
Java Unit Testing Competition. This year, five tools, Randoop, UtBot, Kex, Evosuite, and EvosuiteDSE, were executed on a benchmark with (i) new classes under test, selected from three open-source software projects, and (ii) the set of classes from three projects considered in the eighth edition. We relied on an improved Docker infrastructure to execute the tools and the subsequent coverage and mutation analysis. Given the high number of participants, we considered only two time budgets for test case generation: thirty seconds and two minutes.
Cyber-Physical Systems Testing Tool Competition. Five tools, Deeper, Frenetic, GABExplore, GABExploit, and Swat, competed on testing self-driving car software by generating simulation-based tests using our new testing infrastructure. We considered two experimental settings to study test generators’ transitory and asymptotic behaviors and evaluated the tools’ test generation effectiveness and the exposed failures’ diversity.
This paper describes our methodology, the statistical analysis of the results, the contestant tools, and the challenges faced while running the competition experiments.
JUGE: An Infrastructure for Benchmarking Java Unit Test Generators
Researchers and practitioners have designed and implemented various automated
test case generators to support effective software testing. Such generators
exist for various languages (e.g., Java, C#, or Python) and for various
platforms (e.g., desktop, web, or mobile applications). Such generators exhibit
varying effectiveness and efficiency, depending on the testing goals they aim
to satisfy (e.g., unit-testing of libraries vs. system-testing of entire
applications) and the underlying techniques they implement. In this context,
practitioners need to be able to compare different generators to identify the
most suited one for their requirements, while researchers seek to identify
future research directions. This can be achieved through the systematic
execution of large-scale evaluations of different generators. However, the
execution of such empirical evaluations is not trivial and requires a
substantial effort to collect benchmarks, set up the evaluation infrastructure,
and collect and analyse the results. In this paper, we present our JUnit
Generation benchmarking infrastructure (JUGE) supporting generators (e.g.,
search-based, random-based, symbolic execution, etc.) seeking to automate the
production of unit tests for various purposes (e.g., validation, regression
testing, fault localization, etc.). The primary goal is to reduce the overall
effort, ease the comparison of several generators, and enhance the knowledge
transfer between academia and industry by standardizing the evaluation and
comparison process. Since 2013, eight editions of a unit testing tool
competition, co-located with the Search-Based Software Testing Workshop, have
taken place and have used and updated JUGE. As a result, an increasing number of
tools (over ten) from both academia and industry have been evaluated on JUGE,
matured over the years, and allowed the identification of future research
directions.
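A hypothetical sketch of the benchmarking loop that such an infrastructure automates is shown below: each generator runs against each benchmark class under each time budget, and the resulting suite is scored in a separate, tool-agnostic coverage and mutation step. Tool names, class names, and the run_generator stub are illustrative, not JUGE's actual interface.

```python
# Hypothetical sketch of the benchmarking loop that JUGE automates: every
# generator runs against every benchmark class under each time budget, and
# the resulting suite is scored afterwards. All names are illustrative.
from dataclasses import dataclass

@dataclass
class Result:
    tool: str
    cut: str            # class under test
    budget_s: int
    suite_path: str

GENERATORS = ["evosuite", "randoop"]        # illustrative tool names
BENCHMARK_CLASSES = ["org.example.Foo"]     # illustrative class under test
TIME_BUDGETS_S = [30, 120]                  # the two budgets mentioned above

def run_generator(tool: str, cut: str, budget_s: int) -> str:
    """Stub standing in for a containerized test-generator invocation."""
    return f"/results/{tool}/{cut}/{budget_s}s"

results = [
    Result(tool, cut, budget, run_generator(tool, cut, budget))
    for tool in GENERATORS
    for cut in BENCHMARK_CLASSES
    for budget in TIME_BUDGETS_S
]
for r in results:
    # Coverage and mutation analysis would be computed per suite afterwards.
    print(r)
```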
TEASER: Simulation-based CAN Bus Regression Testing for Self-driving Cars Software
Software systems for safety-critical systems like self-driving cars (SDCs)
need to be tested rigorously. Especially electronic control units (ECUs) of
SDCs should be tested with realistic input data. In this context, a
communication protocol called Controller Area Network (CAN) is typically used
to transfer sensor data to the SDC control units. A challenge for SDC
maintainers and testers is the need to manually define the CAN inputs that
realistically represent the state of the SDC in the real world. To address this
challenge, we developed TEASER, a tool that generates realistic CAN
signals for SDCs from sensor data obtained from state-of-the-art car simulators. We
evaluated TEASER based on its integration capability into a DevOps pipeline of
aicas GmbH, a company in the automotive sector. Concretely, we integrated
TEASER into a Continuous Integration (CI) pipeline configured with Jenkins. The
pipeline executes the test cases in simulation environments and sends the
sensor data over the CAN bus to a physical CAN device, which is the test
subject. Our evaluation shows the ability of TEASER to generate and execute CI
test cases that expose simulation-based faults (using regression strategies);
the tool produces CAN inputs that realistically represent the state of the SDC
in the real world. This result is of critical importance for increasing
automation and effectiveness of simulation-based CAN bus regression testing for
SDC software. Tool: https://doi.org/10.5281/zenodo.7964890 GitHub:
https://github.com/christianbirchler-org/sdc-scissor/releases/tag/v2.2.0-rc.1
Documentation: https://sdc-scissor.readthedocs.io
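To give a flavor of what generating CAN inputs from simulator data involves, the following sketch packs a (hypothetical) speed signal into CAN frames using the python-can library and sends them on an in-process virtual bus; the arbitration ID and payload encoding are assumptions, and TEASER's actual signal mapping may differ.

```python
# Illustrative sketch of replaying simulator sensor values on a CAN bus
# with the python-can library, in the spirit of what TEASER automates.
# The arbitration ID and 8-byte payload encoding are assumptions.
import struct
import can

def speed_to_frame(speed_kmh: float) -> can.Message:
    """Pack a (hypothetical) speed signal into an 8-byte classic CAN frame."""
    payload = struct.pack(">f", speed_kmh).ljust(8, b"\x00")
    return can.Message(arbitration_id=0x100, data=payload, is_extended_id=False)

# An in-process virtual bus stands in for the physical CAN device; a real
# setup would use, e.g., interface="socketcan" with a hardware channel.
with can.Bus(interface="virtual", channel="sim") as bus:
    for speed in [12.5, 13.0, 13.8]:  # values a simulator might produce
        bus.send(speed_to_frame(speed))
```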
Cost-effective simulation-based test selection in self-driving cars software with SDC-Scissor
Simulation platforms facilitate the continuous development of complex systems such as self-driving cars (SDCs). However, previous results on testing SDCs using simulations have shown that most of the automatically generated tests do not strongly contribute to establishing confidence in the quality and reliability of the SDC. Therefore, those tests can be characterized as “uninformative”, and running them generally means wasting precious computational resources. We address this issue with SDC-Scissor, a framework that leverages Machine Learning to identify simulation-based tests that are unlikely to detect faults in the SDC software under test and skip them before their execution. Consequently, by filtering out those tests, SDC-Scissor reduces the number of long-running simulations to execute and drastically increases the cost-effectiveness of simulation-based testing of SDC software. Our evaluation on two large datasets and around 12’000 tests showed that SDC-Scissor achieved a higher classification F1-score (between 47% and 90%) than a randomized baseline in identifying tests that lead to a fault and reduced the time spent running uninformative tests (speedup between 107% and 170%). Webpage & Video: https://github.com/ChristianBirchler/sdc-scissor